Appendix: Representing Hyperbolic Space Accurately using Multi-Component Floats
Algorithm 4 (Scale-Expansion, modified from [4]) renormalizes an expansion to reduce its number of components. At the start of training, we train models with an initial "burn-in" phase. We note an interesting tuning result here: taking the training of the halfspace model on WordNet Mammals as an example, we varied the learning rate for different batch sizes, as shown in Table 1. We found that, if the model is trained with a larger batch size and the learning rate is increased appropriately, the embedding performance of the converged model can nearly match the best performance of a model converged with a smaller batch size.
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > Canada (0.04)
Gaussian Mixture Model with unknown diagonal covariances via continuous sparse regularization
Giard, Romane, de Castro, Yohann, Marteau, Clément
This paper addresses the statistical estimation of Gaussian Mixture Models (GMMs) with unknown diagonal covariances from independent and identically distributed samples. We employ the Beurling-LASSO (BLASSO), a convex optimization framework that promotes sparsity in the space of measures, to simultaneously estimate the number of components and their parameters. Our main contribution extends the BLASSO methodology to multivariate GMMs with component-specific unknown diagonal covariance matrices, a significantly more flexible setting than previous approaches requiring known and identical covariances. We establish non-asymptotic recovery guarantees with nearly parametric convergence rates for component means, diagonal covariances, and weights, as well as for density prediction. A key theoretical contribution is the identification of an explicit separation condition on mixture components that enables the construction of non-degenerate dual certificates, essential tools for establishing statistical guarantees for the BLASSO. Our analysis leverages the Fisher-Rao geometry of the statistical model and introduces a novel semi-distance adapted to our framework, providing new insights into the interplay between component separation, parameter space geometry, and achievable statistical recovery.
- Europe > France (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
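The model class in the abstract above (a mixture of Gaussians, each with its own unknown diagonal covariance) can be made concrete with a minimal EM baseline; the BLASSO itself is a convex program over measures and is not sketched here. Everything below (the two-component toy data, the separation, the crude deterministic initialization) is our choice for illustration, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: two well-separated diagonal-covariance Gaussians in R^2
means_true = np.array([[-2.0, 0.0], [2.0, 1.0]])
vars_true = np.array([[0.3, 0.5], [0.4, 0.2]])
X = np.vstack([
    rng.normal(means_true[k], np.sqrt(vars_true[k]), size=(300, 2))
    for k in range(2)
])

K = 2
n, d = X.shape
w = np.full(K, 1.0 / K)
mu = np.array([[-1.0, 0.0], [1.0, 0.0]])   # crude deterministic init
var = np.ones((K, d))

for _ in range(100):
    # E-step: responsibilities under diagonal Gaussians (log-space for stability)
    logp = -0.5 * (((X[:, None, :] - mu) ** 2) / var
                   + np.log(2 * np.pi * var)).sum(axis=-1) + np.log(w)
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form updates of weights, means, diagonal variances
    nk = r.sum(axis=0)
    w = nk / n
    mu = (r.T @ X) / nk[:, None]
    var = (r.T @ X**2) / nk[:, None] - mu**2
    var = np.maximum(var, 1e-6)            # guard against variance collapse

print(np.sort(mu[:, 0]))  # first coordinates of the fitted means: near -2 and 2
```

Unlike the BLASSO, this EM sketch needs the number of components K up front; the point of the measure-space formulation in the paper is precisely to estimate K and the parameters jointly.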
Reinforcement Learning with Verifiable Rewards: GRPO's Effective Loss, Dynamics, and Success Amplification
Group Relative Policy Optimization (GRPO) was introduced and used successfully to train DeepSeek R1 models, promoting the reasoning capabilities of LLMs using verifiable or binary rewards. We show in this paper that GRPO with verifiable rewards can be written as a Kullback-Leibler ($\mathsf{KL}$) regularized contrastive loss, where the contrastive samples are synthetic data sampled from the old policy. The optimal GRPO policy $\pi_{n}$ can be expressed explicitly in terms of the binary reward and the first- and second-order statistics of the old policy ($\pi_{n-1}$) and the reference policy $\pi_0$. Iterating this scheme, we obtain a sequence of policies $\pi_{n}$ for which we can quantify the probability of success $p_n$. We show that this probability satisfies a recurrence that converges to a fixed point of a function depending on the initial probability of success $p_0$ and the regularization parameter $\beta$ of the $\mathsf{KL}$ regularizer. We show that the fixed point $p^*$ is guaranteed to be larger than $p_0$, demonstrating that GRPO effectively amplifies the probability of success of the policy.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.52)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)
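The fixed-point amplification claim in the abstract above can be illustrated with a toy recurrence. The map below is a hypothetical stand-in, not the paper's exact update (which is derived from the first- and second-order statistics of the old policy): we exponentially tilt success vs. failure mass by the reward, then mix back toward the reference success probability to mimic the $\mathsf{KL}$ regularizer.

```python
import math

def grpo_success_update(p, p0, beta, reward_gap=1.0):
    """Hypothetical one-step map for the success probability p_n.
    Tilt success/failure mass by exp(+-reward_gap/beta), then mix back
    toward the reference p0; illustrative only, not the paper's map."""
    w_pos = p * math.exp(reward_gap / beta)
    w_neg = (1.0 - p) * math.exp(-reward_gap / beta)
    tilted = w_pos / (w_pos + w_neg)
    lam = beta / (1.0 + beta)      # stronger KL term => stay closer to p0
    return (1.0 - lam) * tilted + lam * p0

def iterate_to_fixed_point(p0, beta, tol=1e-12, max_iter=100_000):
    p = p0
    for _ in range(max_iter):
        p_next = grpo_success_update(p, p0, beta)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

p0, beta = 0.3, 2.0
p_star = iterate_to_fixed_point(p0, beta)
print(p_star)  # fixed point strictly above the initial success probability
```

Because the tilted value always exceeds p for p in (0, 1), the mixed map sits above the diagonal at p0, so the iteration climbs monotonically to a fixed point p* > p0, mirroring the amplification result stated in the abstract.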
Interaction Screening and Pseudolikelihood Approaches for Tensor Learning in Ising Models
Liu, Tianyu, Mukherjee, Somabha
In this paper, we study two well-known methods of Ising structure learning, namely the pseudolikelihood approach and the interaction screening approach, in the context of tensor recovery in $k$-spin Ising models. We show that both approaches, with proper regularization, retrieve the underlying hypernetwork structure using a sample size logarithmic in the number of network nodes and exponential in the maximum interaction strength and maximum node degree. We also track the exact dependence of the rate of tensor recovery on the interaction order $k$, which is allowed to grow with the number of samples and nodes, for both approaches. Finally, we provide a comparative discussion of the performance of the two approaches based on simulation studies, which also demonstrate the exponential dependence of the tensor recovery rate on the maximum coupling strength.
- North America > United States (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
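The pseudolikelihood approach from the abstract above reduces to node-wise logistic regression; a minimal sketch for the pairwise (k = 2) special case follows, with $\ell_1$ regularization via proximal gradient (ISTA). The chain topology, coupling strength 0.5, and all tuning constants are our toy choices, and the tensor (k-spin) generalization is not attempted here.

```python
import numpy as np

def node_pseudolikelihood_fit(sigma, i, lam=0.05, lr=0.1, steps=500):
    """l1-regularized node-wise logistic pseudolikelihood for a pairwise
    Ising model, fit by proximal gradient (ISTA).
    sigma: (num_samples, num_nodes) array of +-1 spins."""
    m, p = sigma.shape
    x = np.delete(sigma, i, axis=1)        # spins of the other nodes
    y = sigma[:, i].astype(float)          # spin of node i
    w = np.zeros(p - 1)
    for _ in range(steps):
        # NLL per sample is log(1 + exp(-2 y x.w)); below is its gradient
        margin = 2.0 * y * (x @ w)
        grad = -(x * (2.0 * y / (1.0 + np.exp(margin)))[:, None]).mean(axis=0)
        w = w - lr * grad
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft-threshold
    return w

# Toy demo: 4-spin chain, couplings J_{01} = J_{12} = J_{23} = 0.5,
# sampled exactly by enumerating all 16 configurations.
rng = np.random.default_rng(1)
p = 4
J = np.zeros((p, p))
for a, b in [(0, 1), (1, 2), (2, 3)]:
    J[a, b] = J[b, a] = 0.5

configs = np.array([[1 if (c >> k) & 1 else -1 for k in range(p)]
                    for c in range(2**p)], dtype=float)
energy = np.einsum('ci,ij,cj->c', configs, J, configs) / 2.0  # sum_{i<j} J s s
probs = np.exp(energy)
probs /= probs.sum()
samples = configs[rng.choice(2**p, size=5000, p=probs)]

w0 = node_pseudolikelihood_fit(samples, 0)
print(w0)  # coupling to the true neighbour (node 1) dominates the non-edges
```

The interaction screening approach would replace the logistic NLL with the interaction screening objective (an exponential of the negative local energy) but keeps the same node-wise, $\ell_1$-regularized structure.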
A General Algorithm for Solving Rank-one Matrix Sensing
Qin, Lianke, Song, Zhao, Zhang, Ruizhe
Matrix sensing has many real-world applications in science and engineering, such as system control, distance embedding, and computer vision. The goal of matrix sensing is to recover a matrix $A_\star \in \mathbb{R}^{n \times n}$ from a sequence of measurements $(u_i,b_i) \in \mathbb{R}^{n} \times \mathbb{R}$ such that $u_i^\top A_\star u_i = b_i$. Previous work [ZJD15] focused on the scenario where the matrix $A_{\star}$ has small rank, e.g., rank-$k$. Their analysis relies heavily on the RIP assumption, making it unclear how to generalize to high-rank matrices. In this paper, we relax the rank-$k$ assumption and solve a much more general matrix sensing problem. Given an accuracy parameter $\delta \in (0,1)$, we can compute $A \in \mathbb{R}^{n \times n}$ in $\widetilde{O}(m^{3/2} n^2 \delta^{-1})$ time such that $|u_i^\top A u_i - b_i| \leq \delta$ for all $i \in [m]$. We design an efficient algorithm with provable convergence guarantees using stochastic gradient descent for this problem.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
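The measurement model $u_i^\top A u_i = b_i$ above is linear in the entries of $A$, so a plain SGD sketch already recovers a consistent solution. This is a minimal illustration of the problem setup, not the paper's algorithm or its complexity guarantee; the symmetric ground truth, dimensions, and step size are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 200
G = rng.standard_normal((n, n))
A_star = (G + G.T) / 2                      # ground truth (symmetric toy choice)
U = rng.standard_normal((m, n))
b = np.einsum('ij,jk,ik->i', U, A_star, U)  # b_i = u_i^T A_star u_i

A = np.zeros((n, n))
lr = 5e-4
for epoch in range(400):
    for i in rng.permutation(m):
        u = U[i]
        r = u @ A @ u - b[i]                # residual of measurement i
        A -= lr * 2.0 * r * np.outer(u, u)  # grad of (u^T A u - b_i)^2 w.r.t. A

residuals = np.abs(np.einsum('ij,jk,ik->i', U, A, U) - b)
print(residuals.max())
```

Note that rank-one probes $u_i u_i^\top$ only sense the symmetric part of $A_\star$, which is why the toy ground truth is symmetrized; the recovered $A$ then satisfies every measurement to small error without any rank assumption.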
A Quasi-centralized Collision-free Path Planning Approach for Multi-Robot Systems
This paper presents a novel quasi-centralized approach for collision-free path planning of multi-robot systems (MRS) in obstacle-ridden environments. A new formation potential fields (FPF) concept is proposed around a virtual agent located at the center of the formation, which ensures self-organization and maintenance of the formation. The path of the virtual agent is centrally planned, and the robots at the minima of the FPF are forced to move along with the virtual agent. In the neighborhood of obstacles, individual robots selfishly avoid collisions, thus deviating marginally from the formation. The proposed quasi-centralized approach introduces formation flexibility into the MRS, enabling it to navigate effectively in an obstacle-ridden workspace. A methodical analysis of the proposed approach and guidelines for selecting the FPF are presented. Results using a candidate FPF show that a pentagonal formation effectively squeezes through a narrow passage while avoiding collisions with the walls.
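The attract-to-slot / repel-from-obstacle mechanism described above can be sketched with a generic potential field. The field below is an illustrative choice (quadratic attraction plus the classic short-range repulsive potential), not the paper's candidate FPF, and the gains `k_att`, `k_rep`, `d0` are assumed values.

```python
import numpy as np

def fpf_gradient(pos, slot, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    """Gradient of a simple formation potential field: quadratic attraction
    to the robot's slot around the virtual agent, plus repulsion
    0.5*k_rep*(1/d - 1/d0)^2 active within distance d0 of each obstacle."""
    grad = k_att * (pos - slot)                 # pulls the robot to its slot
    for ob in obstacles:
        diff = pos - ob
        d = np.linalg.norm(diff)
        if 0.0 < d < d0:                        # repulsion active inside d0
            grad += k_rep * (1.0 / d0 - 1.0 / d) / d**3 * diff
    return grad

# A robot descends the field toward its slot (no obstacles in the way here)
slot = np.array([2.0, 1.0])
pos = np.array([0.0, 0.0])
for _ in range(200):
    pos = pos - 0.1 * fpf_gradient(pos, slot, obstacles=[])
print(pos)  # converges to the slot at (2, 1)
```

In the quasi-centralized scheme, `slot` would be updated each step from the centrally planned path of the virtual agent, while the repulsive term lets each robot deviate locally near obstacles.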
Noise-induced degeneration in online learning
Sato, Yuzuru, Tsutsui, Daiji, Fujiwara, Akio
Gradient descent is the simplest optimisation algorithm, represented by gradient dynamics in a potential. When the input data is finite, gradient descent dynamics fluctuates due to finite-size effects and is called stochastic gradient descent. In this paper, we study the stability of stochastic gradient descent dynamics from the viewpoint of dynamical systems theory. Learning is characterised as nonautonomous dynamics driven by uncertain input from the external environment, and as multi-scale dynamics consisting of slow memory dynamics and fast system dynamics. When the uncertain input sequences are modelled by stochastic processes, the dynamics of learning is described by a random dynamical system. In contrast to the traditional Fokker-Planck approaches [5, 15], random dynamical system approaches enable the study not only of stationary distributions and global statistics, but also of the pathwise structure of stochastic dynamics. Based on nonautonomous and random dynamical system theory, it is possible to analyse stability and bifurcation in machine learning.
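The random-dynamical-system picture above (SGD as a composition of random maps drawn from a finite family, with pathwise fluctuations that never die out) can be seen in a one-dimensional toy. The per-sample quadratic losses below are our illustrative choice, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Finite dataset of per-sample losses f_i(x) = 0.5 * a_i * (x - c_i)^2,
# so each SGD step applies one random affine map from a finite family:
# x -> (1 - eta*a_i) * x + eta*a_i*c_i.
a = rng.uniform(0.5, 1.5, size=50)
c = rng.normal(0.0, 1.0, size=50)
eta = 0.1

x = 5.0
path = [x]
for _ in range(2000):
    i = rng.integers(len(a))
    x = x - eta * a[i] * (x - c[i])     # one stochastic gradient step
    path.append(x)
path = np.array(path)

x_min = np.sum(a * c) / np.sum(a)       # minimiser of the averaged loss
tail = path[-500:]
print(tail.mean() - x_min, tail.std())  # small bias, persistent jitter
```

The path contracts toward the minimiser of the averaged loss but keeps fluctuating at a scale set by the learning rate, which is exactly the finite-size effect the abstract refers to; a Fokker-Planck analysis would only see the stationary spread, while the pathwise view retains the individual random trajectory.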